Added chat & completions API #10
base: master
Conversation
"stream": stream, | ||
} | ||
|
||
base_url = "https://vllm-tt-dev-49305ac9.workload.tenstorrent.com/v1" |
We should create an ingress route that would be standard for EdenAI vLLM deployments. The workload I tested is currently in vllm-tt-dev under the tenstorrent team, but I'll move it to the EdenAI team.
yes
I think we might also need to add some taints on the LLMBoxes in k8s that we specifically want to reserve for Eden, maybe?
"max_tokens": max_tokens, | ||
} | ||
|
||
base_url = "https://vllm-tt-dev-49305ac9.workload.tenstorrent.com/v1" |
Same here: we should create an ingress route that would be standard for EdenAI vLLM deployments, and I'll move the vllm-tt-dev workload from the tenstorrent team to the EdenAI team.
Great progress. Here are some thoughts.
"stream": stream, | ||
} | ||
|
||
base_url = "https://vllm-tt-dev-49305ac9.workload.tenstorrent.com/v1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
    }

    base_url = "https://vllm-tt-dev-49305ac9.workload.tenstorrent.com/v1"
    client = OpenAI(base_url=base_url, api_key=self.api_key)
I like that you are using the OpenAI client, but I'm not sure the client should be redefined on every call like this. Remember that when the client is defined, that initializes a session; if we want sticky sessions to work, the same client needs to be re-used by the same user.
Note that the example edenai-api by openai you shared earlier also uses the client, and the client is saved to the class as a member of self. I'm actually confused where the client is initialized. Can you figure out where it is initialized? We should probably do similar.
Also, just noting for comparison that the openai implementation of text__generation does not use the client and just uses the requests module, but I think we should use the client.
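For illustration, here is a minimal sketch of the reuse pattern being suggested. The class and method names are placeholders, not the actual EdenAI code; the point is that the client is constructed once and then shared by every call, so the underlying HTTP session (and any sticky-session routing keyed to it) is preserved.

```python
from openai import OpenAI

class TenstorrentApi:  # hypothetical provider class, for illustration only
    def __init__(self, api_key: str, base_url: str):
        # Create the client once; it owns a persistent HTTP session,
        # so reusing it keeps connections (and sticky sessions) alive.
        self.client = OpenAI(base_url=base_url, api_key=api_key)

    def text__chat(self, messages, model, max_tokens, stream=False):
        # Reuse self.client instead of constructing OpenAI(...) per call.
        return self.client.chat.completions.create(
            model=model,
            messages=messages,
            max_tokens=max_tokens,
            stream=stream,
        )
```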
Ahh, great point! You are right that if we want to leverage sticky sessions/kv-cache reuse, then we probably shouldn't instantiate the client again and again.
After looking a bit at how OpenAI did it, the client used by the text__chat method in the OpenaiTextApi class is initialized in the OpenaiApi class constructor here:
https://github.com/edenai/edenai-apis/blob/aa32409084469b81be98989964b0ab82d7326ccc/edenai_apis/apis/openai/openai_api.py#L55
This happens during the instantiation of the OpenaiApi object in openai_api.py. The OpenaiApi class inherits from OpenaiTextApi, so the self.client object is accessible to text__chat.
I'll try to modify our code along similar lines.
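In other words, the structure is roughly the following (a stripped-down sketch using the class names mentioned above; the method bodies are simplified placeholders, not the actual edenai-apis implementation):

```python
from openai import OpenAI

class OpenaiTextApi:
    # Mixin-style class: it assumes the concrete API class that inherits
    # from it sets self.client in its constructor.
    def text__chat(self, messages, model):
        return self.client.chat.completions.create(model=model, messages=messages)

class OpenaiApi(OpenaiTextApi):
    def __init__(self, api_key: str):
        # The client is created once here, during instantiation of OpenaiApi;
        # inherited methods like text__chat then access it via self.client.
        self.client = OpenAI(api_key=api_key)
```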
Done.
For the other note you mentioned, about openai using requests for text__generation rather than the OpenAI client: our tenstorrent code already uses the OpenAI client for both text__chat and text__generation.
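Continuing the hypothetical TenstorrentApi sketch from above, text__generation can go through the legacy completions endpoint on the same shared client, so both methods reuse one session (parameter names here are illustrative):

```python
    def text__generation(self, prompt: str, model: str, max_tokens: int):
        # Plain-text completions via the same shared client,
        # instead of hand-rolled calls with the requests module.
        return self.client.completions.create(
            model=model,
            prompt=prompt,
            max_tokens=max_tokens,
        )
```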
@rreece One thing I am confused about is whether the name of this file should be set to …
Similar to the other examples from other providers, the name of the file saved to the repo should be …
Cool, sounds like I have been doing it the right way then. Thanks for confirming.